Abstract: In this paper, the problem of communicating using 
chemical messages propagating using Brownian motion, rather 
than electromagnetic messages propagating as waves in free 
space or along a wire, is considered. This problem is motivated 
by nanotechnological and biotechnological applications, where 
the energy cost of electromagnetic communication might be 
prohibitive. Models are given for communication using particles 
that propagate with Brownian motion, and achievable capacity 
results are given. Under conservative assumptions, it is shown 
that rates exceeding one bit per particle are achievable. 

I. Introduction 

In most existing forms of engineered communication, mes- 
sages are transmitted over electromagnetic carriers. Although 
this form of communication has been remarkably successful, 
the emerging field of nanotechnology poses communication 
challenges for which electromagnetic communication might 
be unsuitable. For example, in a conducting fluid (such as 
blood, or seawater), electromagnetic waves cannot propagate, 
while alternatives (such as sonar) may be problematic for very 
small devices. Furthermore, electromagnetic communication 
generally imposes an energy cost which might be undesirable. 

In nature, it is very well understood that chemical com- 
munication is used for communication between nanoscale 
"machines", such as cells or microbes. This form of communi- 
cation is desirable in biological systems owing to its simplicity 
and low energy cost. One such method is quorum sensing, 
in which bacteria exchange messages intended to determine 
roughly the local population of their species [1]. This form 
of communication has attracted the attention of engineers: in 
[2], the genetic sequences of these communication components 
were isolated, with the intention of using them to allow com- 
munication and co-operation between engineered microbial 
"robots"; or to force them to carry out chemical functions 
analogous to logic gates [3]. Recent work has attempted to 
characterize this pathway as a linear communication channel 
[4]. 

Generally, the biological literature has attempted to explain 
the function of chemical messaging, rather than exploiting it 
for artificial purposes. Our contribution in this paper is to 
obtain models for chemical communication channels, and give 
achievable capacity values for those channels. As such, the 
purpose of this paper is to determine the feasibility of this 
type of communication in nanoscale devices. These channels 
are essentially timing channels, in which the noise is the delay 
between releasing a particle into the medium and observing its 
arrival, so previous work on queue timing channels [5], [6] is 
closely related. Furthermore, work on diffusion channels has 
been carried out by Berger (e.g., [7]), though the aim of his 
work is to analyze biochemical processes through the lens of 
information theory. 

The chemical channel is a practically interesting system 
which is poorly understood from the perspective of communi- 
cation. In particular, it is currently unknown how to model this 
channel, and it is therefore useful to know its physical limits 
in terms of information-theoretic capacity. Furthermore, even 
though the computational capabilities of very tiny machines 
are currently rudimentary (which restricts the use of coding, or 
complicated modulation), the capacity gives a very loose upper 
bound on the uncoded capabilities of the system, and gives a 
rough idea as to the potential of chemical communication. 

II. Model 

A. Basic assumptions 

We consider a chemical communication system as in Figure 
1. The transmitter has a reservoir of particles, and forms 
messages by releasing particles at a vector of release 
times $\mathbf{x} = [x_1, x_2, \ldots, x_n]$, where $x_i$ is the time of release of 
the ith particle, and $n$ is the total number of particles released 
to convey the message. There is a distance $d > 0$ between the 
transmitter and the receiver. On release, each particle enters 
a fluid medium between the transmitter and receiver, and the 
position of the ith particle at any time $t > x_i$ is given by a 
Brownian motion $B_i(t)$. 

The following are the key assumptions of this system: 

• The transmitter perfectly controls the departure time of 
each particle. After release, the transmitter is "transpar- 
ent" to the particles, so that there is no effect on the 
particles if they cross the origin. 

• The particle propagates until the first hitting time at the 
receiver (i.e., the smallest $t$ such that $B_i(t) = d$). The 
receiver perfectly observes the hitting time and removes 
the particle from the system. 

• The medium between the transmitter and receiver is 
one-dimensional, with the transmitter at the origin and the 
receiver at $d > 0$. The medium is semi-infinite, defined on 
$(-\infty, d]$. (Particles can never achieve a position greater 
than $d$, since they are removed at their first hitting time.) 

• For particles $i$ and $j$, the paths $B_i(t)$ and $B_j(t)$ are 
independent if $i \neq j$. 

This is an idealized channel model which simplifies the system 
and eliminates all possible sources of noise other than the 
transmission time. 

Fig. 1. A chemical communication system. A particle is propagating from 
transmitter (Tx) to receiver (Rx) using Brownian motion. 

The transmission time $t_i$ of each particle is a random 
variable defined as the length of time between release from 
the transmitter and the first hitting time at the receiver. If the 
Brownian motion channel is considered to be a timing channel, 
then $f_{T_i}(t_i)$ can be thought of as the PDF of the noise process. 
The distribution of the first hitting time is the only stochastic 
property of Brownian motion that we require for this analysis. 

B. Channel input-output relationship 

From Section II-A, x represents a vector of particle release 
times, and the particle released at time $x_i$ arrives at time $x_i + t_i$. 
Letting $\mathbf{t} = [t_1, t_2, \ldots, t_n]$ represent the vector of transmission 
times for each particle, we can form the vector 

$$\mathbf{u} = \mathbf{x} + \mathbf{t}, \tag{1}$$

where $u_i = x_i + t_i$ represents the first hitting time of the ith 
particle at the detector. 

However, the detector does not observe u directly, because 
u is stated in the order that the particles were released, which 
is not necessarily the same as the order in which the particles 
hit the detector. Instead, the detector observes 

$$\mathbf{y} := \mathrm{sort}(\mathbf{u}), \tag{2}$$

where the function $\mathrm{sort}(\cdot)$ takes a vector argument, and returns 
the vector sorted in increasing order. 
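
The input-output relationship in (1) and (2) can be simulated directly. The following is a minimal sketch, assuming the first-hitting-time distribution of Section II-D with d = σ² = 1. The helper names `sample_transit_times` and `channel` are illustrative and are not part of the paper. The sampler relies on the standard identity that the first hitting time of level d is distributed as d²/(σ² Z²) for a standard normal Z.

```python
import numpy as np

def sample_transit_times(n, d=1.0, sigma2=1.0, rng=None):
    """Draw n first hitting times of level d for a driftless Wiener process.

    Assumption: T = d^2 / (sigma2 * Z^2) with Z ~ N(0, 1), which follows from
    the reflection principle and matches the density of Section II-D.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(n)
    return d**2 / (sigma2 * z**2)

def channel(x, rng=None):
    """Return (y, order): sorted arrival times and the release index of each arrival."""
    x = np.asarray(x, dtype=float)
    u = x + sample_transit_times(x.size, rng=rng)  # eq. (1): u = x + t
    order = np.argsort(u)                          # which particle arrived k-th
    return u[order], order                         # eq. (2): y = sort(u)

rng = np.random.default_rng(0)
x = np.array([0.0, 0.5, 1.0, 1.5])   # hypothetical release times
y, order = channel(x, rng)
print("arrivals y:", y)
print("release index of each arrival:", order)
```

Discarding `order` gives the observation available when particles are indistinguishable; keeping it plays the role of the label vector b introduced next.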

Suppose the particles are distinguishable - that is, every 
particle carries a unique label. (In this paper, we assume that 
the transmitter always releases labeled particles in the same 
order, so the labels carry no information related to x, but this 
assumption can be relaxed.) Then the detector observes the 
pair of vectors (y, b), where b is a vector of labels, and where, 
for all $i$, $b_i$ is the label attached to the particle that arrived at 
time $y_i$. 

Obviously, if every element of b is unique, the detector 
can use the pair (y, b) to recover the vector u from (1). In 
this case, since the transmission times $t_i$ are all independent, 
the channel is equivalent to an additive noise channel. We 
write $\pi(\mathbf{y}, \mathbf{b})$ as the inverse of the sorting operation, so that 
$\mathbf{u} = \pi(\mathbf{y}, \mathbf{b})$ is the permutation of y that restores the original 
order of the labels. Thus, we have that 

$$f(\mathbf{y}, \mathbf{b} \mid \mathbf{x}) = \prod_{i=1}^{n} f_{T_i}(u_i - x_i), \tag{3}$$

so long as y is in increasing order (the probability is zero 
otherwise). This channel may be handled in the same manner 
as any additive noise channel, and we give some example 
capacity calculations in Section IV. 
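
As an illustration of (3), the sketch below inverts the sort using the labels and evaluates the labeled likelihood. It assumes the closed-form hitting-time density of Section II-D with d = σ² = 1, and it assumes that the label of a particle is simply its release index. The function names are hypothetical.

```python
import numpy as np

def hitting_pdf(t, d=1.0, sigma2=1.0):
    """First-hitting-time density of Section II-D; zero for t <= 0."""
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0, t, 1.0)  # placeholder value avoids dividing by zero
    pdf = d / np.sqrt(2 * np.pi * sigma2 * safe**3) * np.exp(-d**2 / (2 * sigma2 * safe))
    return np.where(t > 0, pdf, 0.0)

def unsort(y, b):
    """pi(y, b): put the k-th arrival y[k] back at release index b[k]."""
    y = np.asarray(y, dtype=float)
    u = np.empty_like(y)
    u[np.asarray(b, dtype=int)] = y
    return u

def labeled_likelihood(y, b, x):
    """f(y, b | x) from eq. (3); zero if y is not in increasing order."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if np.any(np.diff(y) < 0):
        return 0.0
    u = unsort(y, b)
    return float(np.prod(hitting_pdf(u - x)))

# Particle 0 is released at 0 and arrives at 3; particle 1 is released at 1
# and arrives at 2, so the arrivals are observed in the order (1, 0).
x = np.array([0.0, 1.0])
y = np.array([2.0, 3.0])
b = np.array([1, 0])
print(unsort(y, b), labeled_likelihood(y, b, x))
```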

Now suppose the particles are indistinguishable, so that b 
is not available to the detector. In the following example, 
we derive the PDF f(y | x), from first principles, for two 
indistinguishable particles: 

Example 1: Let $\mathbf{x} = [x_1, x_2]$ represent the release times of 
two indistinguishable particles, and let $\mathbf{y} = [y_1, y_2]$ represent 
the first hitting times of these particles, sorted in order of 
arrival at the detector, as in (2). Then we may write 

$$\begin{aligned}
f(\mathbf{y} \mid \mathbf{x}) &= f(y_1, y_2 \mid x_1, x_2) \\
&= f(y_1, y_2 \mid x_1, x_2, u_1 < u_2) \Pr(u_1 < u_2 \mid x_1, x_2) \\
&\quad + f(y_1, y_2 \mid x_1, x_2, u_1 > u_2) \Pr(u_1 > u_2 \mid x_1, x_2).
\end{aligned} \tag{4}$$

Now consider $f(y_1, y_2 \mid x_1, x_2, u_1 < u_2)$. If we know that $u_1 < u_2$, 
then we know that $\mathbf{y} = \mathrm{sort}([u_1, u_2]) = [u_1, u_2]$, so $u_1 = y_1$ 
and $u_2 = y_2$. Thus, 

$$f(y_1, y_2 \mid x_1, x_2, u_1 < u_2) = f_{u_1, u_2 \mid x_1, x_2, u_1 < u_2}(y_1, y_2 \mid x_1, x_2, u_1 < u_2). \tag{5}$$

Furthermore, from Bayes' rule, 

$$f(u_1, u_2 \mid x_1, x_2, u_1 < u_2) = \begin{cases} \dfrac{f(u_1, u_2 \mid x_1, x_2)}{\Pr(u_1 < u_2 \mid x_1, x_2)}, & u_1 < u_2; \\ 0, & u_1 > u_2. \end{cases} \tag{6}$$

Thus, from (5) and (6), 

$$f(y_1, y_2 \mid x_1, x_2, u_1 < u_2) = \begin{cases} \dfrac{f_{u_1, u_2 \mid x_1, x_2}(y_1, y_2 \mid x_1, x_2)}{\Pr(u_1 < u_2 \mid x_1, x_2)}, & y_1 < y_2; \\ 0, & y_1 > y_2. \end{cases} \tag{7}$$

By a similar argument, we can write 

$$f(y_1, y_2 \mid x_1, x_2, u_1 > u_2) = \begin{cases} \dfrac{f_{u_1, u_2 \mid x_1, x_2}(y_2, y_1 \mid x_1, x_2)}{\Pr(u_1 > u_2 \mid x_1, x_2)}, & y_2 > y_1; \\ 0, & y_2 < y_1. \end{cases} \tag{8}$$

Substituting (7) and (8) into (4), we can write 

$$f(\mathbf{y} \mid \mathbf{x}) = \begin{cases} f_{u_1, u_2 \mid x_1, x_2}(y_1, y_2 \mid x_1, x_2) + f_{u_1, u_2 \mid x_1, x_2}(y_2, y_1 \mid x_1, x_2), & y_1 < y_2; \\ 0, & y_1 > y_2. \end{cases} \tag{9}$$

(End of example.) 



Returning to (3), we see that the same expression is found 
by taking the sum over all possible values of b: 

$$f(\mathbf{y} \mid \mathbf{x}) = \sum_{\mathbf{b} \in \mathcal{P}} f(\mathbf{y}, \mathbf{b} \mid \mathbf{x}) = \begin{cases} \sum_{\mathbf{b} \in \mathcal{P}} f_{\mathbf{u}}(\pi(\mathbf{y}, \mathbf{b}) \mid \mathbf{x}), & \mathbf{y} = \mathrm{sort}(\mathbf{y}), \\ 0, & \mathbf{y} \neq \mathrm{sort}(\mathbf{y}); \end{cases} \tag{10}$$

where $\mathcal{P}$ represents all possible permutations of $n$ letters. In 
the case where $n = 2$, as in Example 1, there are only two 
possible permutations, and we immediately see that (10) is 
equivalent to (9). 

It can be shown that exact calculation of the PDF in (10) 
is equivalent to taking the permanent of an $n \times n$ matrix. 
Calculating the permanent is known to be a member of the 
class of #P-complete problems¹ [8], which are believed to be 
intractable for large $n$. 
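
To make the permanent connection concrete, the sketch below evaluates the unlabeled likelihood by brute force over all n! permutations, using the matrix whose (i, j) entry is the hitting-time density evaluated at $y_j - x_i$. This is only an illustration under the assumptions d = σ² = 1 and the closed-form density of Section II-D; the function names are hypothetical, and the n! cost is exactly why this route is impractical for large n.

```python
import itertools
import numpy as np

def hitting_pdf(t, d=1.0, sigma2=1.0):
    """First-hitting-time density of Section II-D; zero for t <= 0."""
    t = np.asarray(t, dtype=float)
    safe = np.where(t > 0, t, 1.0)  # placeholder value avoids dividing by zero
    pdf = d / np.sqrt(2 * np.pi * sigma2 * safe**3) * np.exp(-d**2 / (2 * sigma2 * safe))
    return np.where(t > 0, pdf, 0.0)

def unlabeled_likelihood(y, x):
    """f(y | x): permanent of a[i, j] = f(y[j] - x[i]), by brute force."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if np.any(np.diff(y) < 0):
        return 0.0  # y must be sorted
    a = hitting_pdf(y[None, :] - x[:, None])
    n = x.size
    rows = np.arange(n)
    # Sum over every assignment of arrivals to released particles.
    return float(sum(np.prod(a[rows, list(p)])
                     for p in itertools.permutations(range(n))))

x = np.array([0.0, 0.5])
y = np.array([1.0, 2.0])
two_term = (hitting_pdf(1.0) * hitting_pdf(1.5)      # particle 1 arrives first
            + hitting_pdf(2.0) * hitting_pdf(0.5))   # particle 2 arrives first
print(unlabeled_likelihood(y, x), float(two_term))
```

For n = 2 the brute-force sum reduces to the two terms obtained in Example 1.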

C. Discrete-time model 

Instead of observing the exact arrival times of each particle, 
suppose we have a discrete-time model with the following 
properties: 

• Time is partitioned into intervals, indexed by 
$\mathcal{I} = \{1, 2, \ldots, |\mathcal{I}|\}$, each of duration $\tau$. 

• Particles are only released at the beginning of an interval. 
The vector $\mathbf{r} = [r_1, r_2, \ldots, r_{|\mathcal{I}|}]$ gives the 
number of particles released at the beginning of each 
interval, where for $i \in \mathcal{I}$, $r_i$ represents the number of 
particles released at the beginning of the ith interval. 

• The detector reports the count of the number of particles 
that arrive in each interval. The vector 
$\mathbf{c} = [c_1, c_2, \ldots, c_{|\mathcal{I}|}]$ gives the counts in each interval, where 
for $i \in \mathcal{I}$, $c_i$ represents the number of particles that 
arrived in the ith interval. 

This model is no less intractable than the 
continuous-time model. However, we will see in Section III 
that a reasonably good (and tractable) approximation exists 
for this model, which leads to a lower bound on the capacity 
of the system. Even for the exact discrete-time model, it is 
obvious that such a model leads to a lower bound on the 
system capacity. 
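
Although the exact likelihood is intractable, the discrete-time model is easy to simulate. The sketch below is a hedged illustration: it assumes d = σ² = 1, draws hitting times via the identity T = d²/(σ² Z²) for a standard normal Z, indexes intervals from 0 rather than 1, and silently drops particles that arrive after the last interval. The name `simulate_counts` is hypothetical.

```python
import numpy as np

def simulate_counts(r, tau, d=1.0, sigma2=1.0, rng=None):
    """Simulate the count vector c for a release vector r (intervals 0..len(r)-1)."""
    rng = np.random.default_rng() if rng is None else rng
    r = np.asarray(r, dtype=int)
    n_intervals = r.size
    # One release time per particle: the start of its release interval.
    release = np.repeat(np.arange(n_intervals) * tau, r)
    z = rng.standard_normal(release.size)
    arrival = release + d**2 / (sigma2 * z**2)
    # Keep only arrivals inside the observation window, then bin them.
    kept = arrival[arrival < n_intervals * tau]
    idx = np.minimum((kept // tau).astype(int), n_intervals - 1)
    return np.bincount(idx, minlength=n_intervals)

r = np.array([1, 0, 2, 0, 0, 1, 0, 0])  # hypothetical release pattern
counts = simulate_counts(r, tau=1.0, rng=np.random.default_rng(1))
print("released:", r)
print("counted: ", counts)
```

Because a particle cannot arrive before it is released, the running total of counts never exceeds the running total of releases.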

D. Statistical model of transmission time 

To model the diffusion process from the transmitter to the 
receiver, we use the Wiener process. There are better physical 
models for Brownian motion, but the Wiener process has the 
advantage that the PDF of the first hitting time t of each 
particle can be expressed in closed form. In the remainder of 
the paper, none of the techniques depend on this particular 
PDF for the first hitting time $t$, so nothing changes if 
another model for the first hitting time is substituted for it, or 
if effects such as drift are included in the Brownian motion. 

1 #P-complete is pronounced "sharp-P complete". 

A Wiener process $w(t)$ is a continuous-time random process 
where, for $t' > t$ and for some constant $\sigma^2$, $w(t') - w(t)$ is 
Gaussian distributed with zero mean and variance $\sigma^2 (t' - t)$; 
and where the increment $w(t') - w(t)$ on the interval $[t, t']$ is 
independent of the increment on any other disjoint interval. We 
assume that $w(0) = 0$, and that $w(t)$ is undefined for $t < 0$. It 
is a well-known result for the Wiener process (see, e.g., [9]) 
that the first hitting time (i.e., the transmission time), written 
$t_i$ for the ith particle, has a PDF given by 

$$f(t_i) = \begin{cases} 0, & t_i \le 0, \\ \dfrac{d}{\sqrt{2 \pi \sigma^2 t_i^3}} \exp\left( -\dfrac{d^2}{2 \sigma^2 t_i} \right), & t_i > 0. \end{cases} \tag{11}$$

From (11), $f(t_i)$ has an extremely long tail that decays as 
$\Theta(t_i^{-3/2})$. The mean, and all other moments of this density, 
are equal to $\infty$. As a result, if a detector waits for all particles 
to arrive before decoding a message, the average waiting time 
will be $\infty$, which means that the average data rate, in bits per 
second, could be zero. In such a case, it may make sense to 
define a transmission interval $T$, and declare any particle with 
transmission time $t_i > T$ to be lost. 

The constants $d$ and $\sigma^2$ depend on the physical properties 
of the system. In the remainder of the paper, we will assume 
for simplicity that $d = \sigma^2 = 1$. 
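
The heavy tail can be checked numerically. The sketch below compares an empirical CDF of simulated hitting times against the closed-form CDF obtained by integrating the density above, which is erfc(d / sqrt(2 σ² t)), and reports the probability that a particle is declared lost for a few values of T. The function names are illustrative, and the sampler assumes the identity T = d²/(σ² Z²) for a standard normal Z.

```python
import math
import numpy as np

def hitting_cdf(t, d=1.0, sigma2=1.0):
    """Pr(t_i <= t) = erfc(d / sqrt(2 * sigma2 * t)) for t > 0, else 0."""
    return math.erfc(d / math.sqrt(2.0 * sigma2 * t)) if t > 0 else 0.0

def sample_hitting_times(n, d=1.0, sigma2=1.0, rng=None):
    """Draw n hitting times using T = d^2 / (sigma2 * Z^2), Z ~ N(0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(n)
    return d**2 / (sigma2 * z**2)

t = sample_hitting_times(200_000, rng=np.random.default_rng(0))
for T in (1.0, 10.0, 100.0):
    empirical = float(np.mean(t <= T))
    exact = hitting_cdf(T)
    print(f"T={T:6.1f}  empirical={empirical:.4f}  exact={exact:.4f}  "
          f"Pr(lost)={1.0 - exact:.4f}")
```

Under these assumptions the loss probability shrinks only like the inverse square root of T, which is the practical face of the infinite mean.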

III. Capacity bounds 

A. Simplified systems 

We firstly consider the following simplified systems, calcu- 
lating capacity in bits per unit time for an unbounded number 
of particles, and capacity in bits per particle for unbounded 
time. In both cases the capacity is infinite: 

• Unbounded number of particles. The particle-release 
channel is an infinite-server queue, so we can use a 
similar argument to the calculation of the infinite-server 
queue capacity [6] to show that its capacity, in bits per 
unit time, is $\infty$. 

• Unbounded time. We can take an interval of time $T$ 
and divide it into segments of length $\log T$. Using 
pulse-position modulation, a message is sent by transmitting a 
single particle at the beginning of one of the $T / \log T$ 
segments. As $T \to \infty$, the particle arrives within the 
same segment with probability approaching 1, allowing the 
error-free transmission of $\log_2(T / \log T)$ bits; and since 
$T / \log T \to \infty$ as $T \to \infty$, so does $\log_2(T / \log T)$, 
so the capacity, in bits per particle, is $\infty$. 
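
The pulse-position argument can be made tangible with a small calculation. The sketch below assumes d = σ² = 1, takes the segment length to be the natural logarithm of T (the base is not stated in the text, so this is an assumption), and uses the closed-form CDF erfc(d / sqrt(2 σ² t)) of the hitting time. The name `ppm_summary` is hypothetical.

```python
import math

def ppm_summary(T, d=1.0, sigma2=1.0):
    """Return (bits per particle, Pr(arrival in the release segment)) for window T."""
    seg = math.log(T)                 # segment length, natural log assumed
    bits = math.log2(T / seg)         # log2 of the number of segments
    p_same = math.erfc(d / math.sqrt(2.0 * sigma2 * seg))
    return bits, p_same

results = [ppm_summary(T) for T in (1e2, 1e6, 1e12, 1e24)]
for T, (bits, p_same) in zip((1e2, 1e6, 1e12, 1e24), results):
    print(f"T={T:.0e}  bits={bits:6.1f}  Pr(same segment)={p_same:.3f}")
```

Both quantities grow with T, but the success probability approaches 1 very slowly, so the infinite capacity per particle is a limiting statement rather than a practical operating point.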

Since both time and particles are precious resources, one 
might consider capacity per unit time and per particle. Fur- 
thermore, releasing an enormous number of particles at once is 
impractical, so we can consider limitations on the transmission 
rate of the particles (e.g., the transmitter is allowed to release 
at most one particle per unit time). We will consider both of 
these circumstances in Section IV. 

B. Labeled particles 

As we indicated in Section II, the calculation of $f(\mathbf{y} \mid \mathbf{x})$ 
is intractable. However, if the vector b of permuted labels is 
observed, then $f(\mathbf{y}, \mathbf{b} \mid \mathbf{x})$ is both tractable and straightforward. 
From (3), knowledge of b and y recovers u, and separates 
each particle into an independent channel with input $x_i$ and 
output $u_i$. Thus, $I(\mathbf{Y}, \mathbf{B}; \mathbf{X})$ can be calculated 
straightforwardly, as for any additive independent channel. 

The operation of "labeling" a particle might be costly. For 
instance, it might be accomplished by maintaining a reservoir 
of unique particles, or by synthesizing a novel particle for 
each element of x. As a result, we can consider labellings 
that use fewer unique elements. For instance, suppose every 
second particle has a unique label. Now, the vector b does not 
exactly recover u, but partitions the vector y into independent 
channels containing pairs of indistinguishable particles, but 
where the pair of particles in each channel is distinguishable 
from the particles in every other channel. Such a scheme would 
use half as many labels as a scheme where every particle is 
uniquely labeled. 

We use the notation $\mathbf{b}^{(j)}$ to indicate that every jth label 
in the vector is unique. That is, as $n \to \infty$, $\mathbf{b}^{(j)}$ contains 
$n/j$ unique labels. To be consistent with our notation from 
Section II, we let $\mathbf{b}^{(1)} := \mathbf{b}$. We will calculate some example 
capacities for such channels in Section IV, but the following 
proposition gives a straightforward ordering of labellings in 
terms of mutual information: 

Proposition 1: If $j < k$, then $I(\mathbf{Y}, \mathbf{B}^{(j)}; \mathbf{X}) \ge I(\mathbf{Y}, \mathbf{B}^{(k)}; \mathbf{X})$. 

Proof: Suppose the total number of particles is $n$. 
Consider a labeling $\mathbf{b}^{(j)}$ corresponding to the sequence of 
first hitting times y, recalling that $\mathbf{b}^{(j)}$ contains $n/j$ unique 
labels. Suppose, between the transmitter and receiver, there 
is an entity that modifies the labels (without modifying the 
particle trajectories), as follows. A set of $n/j - n/k$ 
labels is selected, uniformly at random from all possible 
such selections (leaving $n/k$ labels unselected); the particles 
bearing the selected labels are then divided (uniformly at random) into 
$n/k$ groups, corresponding to the unselected labels. The labels 
on the particles are then replaced with a label from the $n/k$ 
unselected labels, such that each group receives a unique 
unselected label. The result is a labeling $\mathbf{b}^{(k)}$ with $n/k$ unique 
labels. In the limit as $n \to \infty$, the effect of non-integer 
quotients from any of these divisions is negligible. 

Since the relabeling process is a reversible physical process, 
which is independent of x, the system with labeling $\mathbf{b}^{(k)}$ is 
physically degraded with respect to the system with labeling 
$\mathbf{b}^{(j)}$, which is sufficient to prove the proposition. ■ 

An obvious corollary of Proposition 1 is that the system in 
which every particle is distinguishable has the largest capacity 
of any possible such system. Also, since this is a mutual 
information result rather than a capacity result, it is true for 
any possible input distribution. 

C. Bounds from approximate PDFs 

As we have seen, calculating the exact PDF of the random 
process y is intractable for any practical number of particles. 
However, the process is straightforward to generate: given 
a vector x of release times, we simply generate random 
transmission times for each particle, add them to the release 
times, and sort the resulting arrival times in increasing order. 
Thus, performing Monte Carlo expectations 
of any tractable function of y can be accomplished with 
reasonable complexity. 

The mutual information between the random variables X 
and Y can be written 

$$I(\mathbf{X}; \mathbf{Y}) = E\left[ \log \frac{f(\mathbf{y}, \mathbf{x})}{f(\mathbf{y}) f(\mathbf{x})} \right]. \tag{12}$$

Of course, taking the Monte Carlo expectation of this function 
results in no complexity advantage, since the function is still 
intractable. 

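However, suppose we replace $f(\mathbf{y}, \mathbf{x})$ with a tractable 
approximation $g(\mathbf{y}, \mathbf{x})$, with the following properties: 

• $\int_{\mathbf{x}} \int_{\mathbf{y}} g(\mathbf{y}, \mathbf{x}) \, d\mathbf{y} \, d\mathbf{x} = 1$ (i.e., $g(\mathbf{y}, \mathbf{x})$ is a PDF); and 

• $\int_{\mathbf{y}} g(\mathbf{y}, \mathbf{x}) \, d\mathbf{y} = f(\mathbf{x})$ (i.e., the correct marginal 
distribution of x is preserved). 

Then we could write 

$$I(\mathbf{X}; \mathbf{Y}) \approx E\left[ \log \frac{g(\mathbf{y}, \mathbf{x})}{\left( \int_{\mathbf{x}} g(\mathbf{y}, \mathbf{x}) \, d\mathbf{x} \right) f(\mathbf{x})} \right], \tag{13}$$

and, since $g(\mathbf{y}, \mathbf{x})$ is tractable, it would be possible to calculate 
this approximation to $I(\mathbf{X}; \mathbf{Y})$ using Monte Carlo methods. 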

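In fact, we can show that the approximation in (13) is a 
lower bound: 

Proposition 2: For any PDF $g(\mathbf{y}, \mathbf{x})$ satisfying the above 
properties, 

$$I(\mathbf{X}; \mathbf{Y}) \ge E\left[ \log \frac{g(\mathbf{y}, \mathbf{x})}{\left( \int_{\mathbf{x}} g(\mathbf{y}, \mathbf{x}) \, d\mathbf{x} \right) f(\mathbf{x})} \right], \tag{14}$$

with equality if and only if $f(\mathbf{y}, \mathbf{x}) = g(\mathbf{y}, \mathbf{x})$. 

Proof: We can rewrite the right-hand side of (14) as 

$$\begin{aligned}
E\left[ \log \frac{g(\mathbf{y}, \mathbf{x})}{\left( \int_{\mathbf{x}} g(\mathbf{y}, \mathbf{x}) \, d\mathbf{x} \right) f(\mathbf{x})} \right]
&= H(\mathbf{X}) + E\left[ \log g(\mathbf{x} \mid \mathbf{y}) \right] \\
&= H(\mathbf{X}) - H(\mathbf{X} \mid \mathbf{Y}) - D\left( f(\mathbf{x} \mid \mathbf{y}) \,\|\, g(\mathbf{x} \mid \mathbf{y}) \right) \\
&= I(\mathbf{X}; \mathbf{Y}) - D\left( f(\mathbf{x} \mid \mathbf{y}) \,\|\, g(\mathbf{x} \mid \mathbf{y}) \right),
\end{aligned} \tag{15}$$

where $D(f \,\|\, g)$ represents Kullback-Leibler (KL) divergence. 
The proposition immediately follows from (15) and the 
properties of KL divergence. ■ 

Since Proposition 2 gives a lower bound on mutual information, 
it also gives a lower bound on capacity for any input 
distribution $f(\mathbf{x})$. 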
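
The bound can be sanity-checked on a small discrete analog, where sums replace integrals and probability mass functions replace densities. This toy example is an assumption made for illustration and is not the channel studied in the paper: `f` is a random joint pmf over 4 inputs and 6 outputs, and `g` keeps the input marginal of `f` while using a different conditional. The function names are hypothetical.

```python
import numpy as np

def mutual_information(f):
    """I(X;Y) in bits for a joint pmf f[x, y]."""
    fx = f.sum(axis=1, keepdims=True)
    fy = f.sum(axis=0, keepdims=True)
    mask = f > 0
    return float(np.sum(f[mask] * np.log2(f[mask] / (fx * fy)[mask])))

def lower_bound(f, g):
    """E_f[ log2( g(y, x) / (g(y) f(x)) ) ], the discrete analog of the bound."""
    fx = f.sum(axis=1, keepdims=True)   # true input marginal f(x)
    gy = g.sum(axis=0, keepdims=True)   # output marginal implied by g
    mask = f > 0
    return float(np.sum(f[mask] * np.log2(g[mask] / (gy * fx)[mask])))

rng = np.random.default_rng(0)
f = rng.random((4, 6))
f /= f.sum()
# Approximation g: same input marginal as f, perturbed conditional g(y | x).
cond = rng.random((4, 6))
cond /= cond.sum(axis=1, keepdims=True)
g = f.sum(axis=1, keepdims=True) * cond

print("I(X;Y)      =", mutual_information(f))
print("bound via g =", lower_bound(f, g))
print("bound via f =", lower_bound(f, f))
```

The expectation is taken under the true distribution `f`, which is what a Monte Carlo average over simulated channel outputs would estimate.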

D. Approximate discrete time model 

In Proposition 2, any PDF satisfying the given properties 
can be used. However, (15) tells us to look for an approximate 
PDF that minimizes the KL divergence to the true density 
$f(\mathbf{y}, \mathbf{x})$ (or, in the case of the discrete time model, $f(\mathbf{c}, \mathbf{r})$). 
Thus, it is reasonable to look for a tractable density that 
reasonably approximates $f(\mathbf{c}, \mathbf{r})$. 

We can modify the discrete time model from Section II-C 
as follows. Suppose a single particle is transmitted at the 
beginning of an interval of duration $\tau$. Its probability of arriving during 
that interval is given by 

$$p_{\mathrm{arr}} = \int_{t=0}^{\tau} f(t) \, dt,$$

where $f(t)$ is specified in (11). Thus, the probability that the 
particle will arrive in a different interval is given by $1 - p_{\mathrm{arr}}$. 

In the ith interval, the discrete-time counting detector forms 
the observation 

$$c_i = \hat{r}_i + z_i, \tag{16}$$

where, assuming at most one particle is released, 

$$\Pr(\hat{r}_i = 1) = \begin{cases} 0, & r_i = 0, \\ p_{\mathrm{arr}}, & r_i = 1, \end{cases} \tag{17}$$

and where $z_i$ is a Poisson-distributed random variable with 
arrival rate 

$$\lambda = E[r_i] (1 - p_{\mathrm{arr}}).$$

In other words, $z_i$ represents "background" arrivals of particles 
in the system, as an average of $E[r_i](1 - p_{\mathrm{arr}})$ particles will 
arrive as a result of particles that did not arrive in the interval 
in which they were transmitted. A similar model was used to 
approximate the process of corn pollen dispersal [10]. 

We can modify this model in an interesting way to achieve 
higher fidelity. The probability that a particle will arrive in the 
kth interval after its transmission (counting the interval of 
transmission as $k = 1$) is given by 

$$p_{\mathrm{arr}}^{(k)} = \int_{t=(k-1)\tau}^{k\tau} f(t) \, dt.$$

Let $\hat{r}_i^{(k)}$ represent the analog of the previously defined $\hat{r}_i$, 
where 

$$\Pr(\hat{r}_i^{(k)} = 1) = \begin{cases} 0, & r_i = 0, \\ p_{\mathrm{arr}}^{(k)}, & r_i = 1. \end{cases}$$

Now, the counting process $c_i$ is given by 

$$c_i = \sum_{j=0}^{N-1} \hat{r}_{i-j}^{(j+1)} + z_i,$$

and $z_i$ is a Poisson-distributed random variable with arrival 
rate 

$$\lambda = E[r_i] \left( 1 - \sum_{j=1}^{N} p_{\mathrm{arr}}^{(j)} \right).$$

In other words, this channel has a sense of intersymbol 
interference. These models can be easily generalized to the 
case where more than one particle is released at the beginning 
of an interval. 
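
The parameters of the approximate model follow from the hitting-time CDF. The sketch below assumes d = σ² = 1 and the closed-form CDF erfc(d / sqrt(2 σ² t)), and it assumes that the background rate equals the mean number of released particles per interval multiplied by the probability of not arriving within the first N intervals. The function names and the value 0.2 for the mean release rate are hypothetical.

```python
import math

def hitting_cdf(t, d=1.0, sigma2=1.0):
    """Pr(t_i <= t) = erfc(d / sqrt(2 * sigma2 * t)) for t > 0, else 0."""
    return math.erfc(d / math.sqrt(2.0 * sigma2 * t)) if t > 0 else 0.0

def arrival_probs(tau, N, d=1.0, sigma2=1.0):
    """Probability of arriving in the k-th interval after release, k = 1..N."""
    return [hitting_cdf(k * tau, d, sigma2) - hitting_cdf((k - 1) * tau, d, sigma2)
            for k in range(1, N + 1)]

def background_rate(mean_r, tau, N, d=1.0, sigma2=1.0):
    """Poisson rate of arrivals not explained by the last N release intervals."""
    return mean_r * (1.0 - sum(arrival_probs(tau, N, d, sigma2)))

p = arrival_probs(tau=1.0, N=3)
print("arrival probabilities:", [round(q, 4) for q in p])
for N in (1, 2, 3):
    print(f"N={N}  background rate={background_rate(0.2, 1.0, N):.4f}")
```

Increasing N moves probability mass out of the Poisson background and into explicitly modeled arrivals, which is the sense in which the refined model has higher fidelity.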

IV. Examples 

In this section we present two examples of achievable 
results, under conditions suggested in previous sections. 

Example 2: Consider a system with labeled particles and 
no rate restrictions on particle release. Our strategy is as 
follows: release each labeled particle on the interval $[0, T]$, 
and wait until $T$ for particles to arrive; if they have not yet 
arrived by time $T$, the particles are declared missing. We 
calculate $I(\mathbf{Y}, \mathbf{B}; \mathbf{X})/T$ to obtain the capacity, per particle per 
second, for the case where every particle is labeled uniquely. 
For comparison (and to demonstrate Proposition 1), we also 
include a case for $I(\mathbf{Y}, \mathbf{B}^{(2)}; \mathbf{X})/T$. In both cases, we consider 
uniform transmission of particles on the interval $[0, T]$. 

[Figure 2 plot omitted. Legend: bits per particle (1); bits per particle (2); 
bits per particle per second (1); bits per particle per second (2).] 

Fig. 2. Mutual information results for rate-unlimited systems. Systems 
designated (1) have unique labels for every particle, and systems designated 
(2) have unique labels for every second particle. 

Results are depicted in Figure 2. In the figure, we see that 
the mutual information per particle increases monotonically 
with $T$, as expected, but that the mutual information per 
particle per second reaches a maximum value. Furthermore, 
as expected, the labeling $\mathbf{b}^{(2)}$ has smaller mutual information 
than the labeling b. (End of example.) 

In future work, we will optimize the input distribution p(x), 
but preliminary results indicate that the optimized distribution 
is probably close to the uniform distribution. 

In Example 2, we assumed that particles were distinguishable, 
and that there was no restriction on the number of 
particles released in an interval of time. Thus, Example 2 
is calculated on the assumption that an infinite number of 
distinguishable particles (or distinguishable pairs of particles) 
are released in the time interval $[0, T]$. In the following 
example, we use Proposition 2 and the approximate discrete-time 
model from Section III-D to present an achievable mutual 
information result under a (more practical) constraint on the 
rate of particle release. 

Example 3: Suppose our system operates under an average-case 
particle release constraint that, on average, at most five 
particles can be released per second. To achieve this constraint 
using the discrete-time model, we will release at most one 
particle per interval, and select $\tau$ and $p(r)$ accordingly (we 
will assume that the probability of transmitting a symbol is 
independent from interval to interval). 

We use the "inter-symbol interference" approximate model 
with $N = 2$, which appears to have the best results of this class 
of models. In Figure 3 we present a lower bound on mutual 
information per second, and in Figure 4 we present a lower 
bound on mutual information per particle, in both cases using 
Monte Carlo expectation and Proposition 2. As expected, the 
capacity per particle is highest when $p(r)$ is small, meaning 
that there are very few particles in the system (this follows 
from our argument in Section III-A). However, the capacity 
per unit time is small when $p(r)$ is small. Also, as expected, 
the bound on mutual information per unit time increases as $\tau$ 
increases, but reaches a maximum around $\tau = 1$, representing 
the balance between discernibility of the particles and the long 
interval between particles. (End of example.) 

Fig. 3. Mutual information per second with respect to p(x) for various 
values of $\tau$. 

Fig. 4. Mutual information per particle with respect to p(x) for various 
values of $\tau$. 

A remarkable consequence of Example 3 and Figure 4 is 
that, under practical assumptions and at the maximum rate 
of bits per second, transmission of a message of $k$ bits requires 
roughly $3k$ particles. Thus, considering a system where 
molecules play the role of particles, a 1000-bit message can 
be transmitted by carefully releasing roughly 3000 molecules. 
Clearly, this requires very little energy and very little mass, 
which is ideal for nanoscale machines. In our normalized 
system where $d = \sigma^2 = 1$, it takes roughly 36000 seconds 
for the 3000 molecules to arrive, but a tiny value of $d$, which 
is appropriate for a nanoscale machine, would likely increase 
that rate significantly. 

V. Conclusion 

This paper has explored the prospects for communication 
using particles that propagate across a medium using Brownian 
motion. Useful models and techniques have been derived 
which indicate that this is a feasible model for a communica- 
tion system. However, much work remains to be done to create 
a practical system. Firstly, optimized input distributions need 
to be derived for these various methods. Furthermore, in the 
direction of Proposition 2, optimized tractable approximations 
need to be derived. Most importantly, practical methods of 
applying these results to communication in an actual nanoscale 
system need to be obtained, taking into account the complexity 
and energy constraints present in those systems. 